ENH: Provide a pyFAI Geometry Optimization Task #59

Draft
wants to merge 316 commits into
base: dev

@LouConreux commented Oct 16, 2024

Description

This PR provides a geometry refinement task using pyFAI enhanced by Bayesian Optimization.

The code mainly originates from what has been developed in btx/feature_pyFAI.

Checklist

  • Implement task parameters OptimizePyFAIGeometryParameters
    • Write geom_opt.py under lute.io.models
    • Create TaskParameters object and define relevant parameters for input tasks
    • Add nested parameter section for Bayesian Optimization hyperparameter definition
  • Implement First-Party OptimizePyFAIGeometry task
    • Write geom_opt.py under lute.tasks
    • Implement OptimizePyFAIGeometry task according to btx/feature_pyFAI code
    • Check code for fetching geometry
      • CalibFileFinder needs to be pointed to a source: exhaustive list
    • Code for computing powder for workflow definition
      • Convert hdf5 to npy
    • Change LCLSGeom detector definition for Rayonix binning
    • Add relevant _report_to_executor prints
      • Check logger to omit pyFAI optimization infos
    • Find way to retrieve wavelength and calibrant info
  • Implement managed task PyFAIGeometryOptimizer
    • Use MPIExecutor

PR Type:

  • New feature/Enhancement

Address issues:

  • N/A

Testing

Using the configuration YAML below for mfxx49820 run 8

%YAML 1.3
---
title: "LUTE Task Configuration" # Include experiment description if desired
experiment: "mfxx49820"
run: 8
date: "2023/10/25"
lute_version: 0.1      # Do not change unless you need to force an older version
task_timeout: 600
work_dir: "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux"
...
---
OptimizePyFAIGeometry:
  #exp: "mfxx49820"
  #run: 8
  det_type: "epix10k2M"
  in_file: "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/geom/0-end.data"
  powder: "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/powder/r0008_max.npy"
  calibrant: "AgBh"
  bo_params:
    bounds:
      dist: 0.2
      poni1: [-0.01, 0.01]
      poni2: [-0.01, 0.01]
    res: 0.0002
    #n_samples: 50
    #n_iterations: 50
    #prior: True
    #af: "ucb"
    #hyperparams:
      #beta: 1.96
      #epsilon: 0.01

SubmitSMD:
  # Command line arguments
  producer: "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/dorlhiac/smalldata_tools/lcls1_producers/smd_producer.py"
  #run: 99
  experiment: "mfxx49820"
  #stn: 0
  directory: "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/smd_output"
  #gather_interval: 25
  #norecorder: False
  #url: "https://pswww.slac.stanford.edu"
  #epicsAll: False
  #full: False
  #fullSum: False
  #default: true
  #image: False
  #tiff: False
  #centerpix: False
  #postRuntable: False
  #wait: False
  #xtcav: False
  #noarch: False
  # Producer variables. These are substituted into the producer to run specific
  # data reduction algorithms. Uncomment and modify as needed.
  # If you prefer to modify the producer file directly, leave commented.
  # Beginning with `getROIs`, you will need to modify the first entry to be a
  # detector. This detector MUST MATCH one of the detectors in `detnames`.
  # In the future this will be automated. If you have multiple detectors you can
  # add them with their own set of parameters.
  detnames: ["epix10k2M"]
  detSumAlgos:
    all:
      - "calib"
      - "calib_dropped"
      - "calib_dropped_square"
      - "calib_thresADU1"
    epix10k2M:
      - "calib_thresADU5"
      - "calib_max"
    Rayonix:
      - "calib_skipFirst_thresADU1"
      - "calib_skipFirst_max"
...
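
For readers who want to see how the `OptimizePyFAIGeometry` section above maps onto a parameter object with a nested Bayesian Optimization section, here is a minimal sketch. It is not the model in `lute.io.models.geom_opt`: the class names `BayesOptParams` and `OptimizeGeometryParams` are hypothetical, it uses standard-library dataclasses to stay self-contained, and treating the commented-out YAML values as defaults is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Tuple, Union

# A bound is either a single guess (float) or a (low, high) pair.
Bound = Union[float, Tuple[float, float]]


def _default_hyperparams() -> Dict[str, float]:
    # Assumption: mirrors the commented-out `hyperparams` entries in the YAML.
    return {"beta": 1.96, "epsilon": 0.01}


@dataclass
class BayesOptParams:
    """Hypothetical nested section mirroring `bo_params` in the YAML."""

    bounds: Dict[str, Bound]
    res: float
    # Assumption: the commented-out YAML values are used as defaults here.
    n_samples: int = 50
    n_iterations: int = 50
    prior: bool = True
    af: str = "ucb"
    hyperparams: Dict[str, float] = field(default_factory=_default_hyperparams)

    def __post_init__(self) -> None:
        if self.res <= 0:
            raise ValueError("res must be positive")
        normalized: Dict[str, Bound] = {}
        for name, bound in self.bounds.items():
            if isinstance(bound, (list, tuple)):
                # YAML lists arrive as Python lists; store them as tuples.
                if len(bound) != 2 or bound[0] >= bound[1]:
                    raise ValueError(f"bounds[{name!r}] must be [low, high]")
                normalized[name] = (float(bound[0]), float(bound[1]))
            else:
                normalized[name] = float(bound)
        self.bounds = normalized


@dataclass
class OptimizeGeometryParams:
    """Hypothetical top-level section mirroring `OptimizePyFAIGeometry`."""

    det_type: str
    in_file: str
    powder: str
    calibrant: str
    bo_params: BayesOptParams

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "OptimizeGeometryParams":
        data = dict(raw)
        bo_params = BayesOptParams(**data.pop("bo_params"))
        return cls(bo_params=bo_params, **data)


# The same values as the YAML section above, written as a plain dict.
section = {
    "det_type": "epix10k2M",
    "in_file": "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/geom/0-end.data",
    "powder": "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/powder/r0008_max.npy",
    "calibrant": "AgBh",
    "bo_params": {
        "bounds": {"dist": 0.2, "poni1": [-0.01, 0.01], "poni2": [-0.01, 0.01]},
        "res": 0.0002,
    },
}
params = OptimizeGeometryParams.from_dict(section)
```

Keeping the optimizer settings in their own nested object means the task-level fields (detector, input files, calibrant) stay separate from the tuning knobs, which is the split the YAML already suggests.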

Nota Bene

Depending on the bounds given in the YAML, the geometry optimizer scans distances within 100 mm around a distance guess if bounds["dist"] is a float, or between two bounds if bounds["dist"] is a tuple. If providing a distance guess, I recommend running the task with --ntasks=102 so that a 1 mm step is used between scanned distances. If providing distance bounds, set --ntasks=N+2, where N is the desired number of distance steps.
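
To make the relation between `--ntasks` and the distance step concrete, here is a rough sketch of the arithmetic. It is not the task's actual code: the helper name `scan_grid` is hypothetical, and it assumes that the 100 mm window is centered on the guess, that the number of steps is `ntasks - 2`, and that distances are in meters (as in pyFAI's `dist`).

```python
from typing import Sequence, Tuple, Union


def scan_grid(
    bounds_dist: Union[float, Sequence[float]],
    ntasks: int,
    window: float = 0.1,
) -> Tuple[float, float, int]:
    """Return (start, step, n_steps) for the distance scan, in meters.

    Assumptions (not taken from the task's code):
      * a float guess is scanned over a `window`-wide range centered on it;
      * the number of distance steps is ntasks - 2.
    """
    n_steps = ntasks - 2
    if n_steps < 1:
        raise ValueError("ntasks must be at least 3")
    if isinstance(bounds_dist, (int, float)):
        # Single guess: scan a window centered on it.
        start = float(bounds_dist) - window / 2
        span = window
    else:
        # Explicit (low, high) bounds.
        low, high = bounds_dist
        if low >= high:
            raise ValueError("distance bounds must be (low, high)")
        start = float(low)
        span = float(high) - float(low)
    return start, span / n_steps, n_steps


# Guess of 0.2 m with --ntasks=102: 100 steps of 1 mm.
start, step, n_steps = scan_grid(0.2, ntasks=102)

# Bounds of 0.1 m to 0.3 m with --ntasks=42: 40 steps of 5 mm.
b_start, b_step, b_n_steps = scan_grid((0.1, 0.3), ntasks=42)
```

Under these assumptions, choosing a step size for explicit bounds comes down to `ntasks = (high - low) / step + 2`.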

Nota Bene Bene

This implementation depends on the LouConreux fork of lute because, in that version, MPI tasks are mapped by core using a special argument in _submit_cmd in lute.execution.executor: mpi_cmd: str = f"mpirun -np {nprocs} --map-by core"
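
As an illustration of what that command string expands to, here is a small sketch. The helper name `build_mpi_cmd` is hypothetical and is not part of `lute.execution.executor`; only the f-string itself is taken from the note above, and the process count of 102 is just an example value.

```python
import shlex


def build_mpi_cmd(nprocs: int) -> str:
    """Build the mpirun prefix with per-core mapping (hypothetical helper)."""
    if nprocs < 1:
        raise ValueError("nprocs must be a positive integer")
    # Same format string as quoted in the note above.
    return f"mpirun -np {nprocs} --map-by core"


cmd = build_mpi_cmd(102)
# Split into an argument list, e.g. for subprocess-style invocation.
argv = shlex.split(cmd)
```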

Task can be run independently

/sdf/home/l/lconreux/lute/launch_scripts/submit_slurm.sh -t PyFAIGeometryOptimizer -c /sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/yamls/mfxx49820_lute.yaml --partition=milano --ntasks=102 --account=lcls:mfxx49820

Task can be run in workflow

/sdf/home/l/lconreux/lute/launch_scripts/submit_launch_airflow.sh /sdf/home/l/lconreux/lute/launch_scripts/launch_airflow.py -w geom_opt_pyfai -c /sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/yamls/mfxx49820_lute.yaml -e mfxx49820 -r 8 --partition=milano --ntasks=102 --account=lcls:mfxx49820 --debug --test

Outputs

Geometry files are written to the location defined by in_file.replace('0-end.data', f'{run}-end.data').
Summary plots are saved in work_dir/figs/ and posted on eLog.
Logger writes the final geometry at the end of optimization.
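
For the test configuration above, the output geometry path works out as follows. This is only a sketch of the `replace` expression from the first line of this section; the function name `output_geometry_path` is hypothetical.

```python
def output_geometry_path(in_file: str, run: int) -> str:
    """Derive the output geometry path from the input one (hypothetical helper)."""
    # str.replace substitutes every occurrence of the pattern in the path.
    return in_file.replace("0-end.data", f"{run}-end.data")


in_file = "/sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/geom/0-end.data"
out_file = output_geometry_path(in_file, run=8)
print(out_file)
# /sdf/data/lcls/ds/mfx/mfxx49820/scratch/lconreux/geom/8-end.data
```

One thing to keep in mind with this pattern: an input file named, say, `10-end.data` also contains the substring `0-end.data`, so it would be rewritten to `1{run}-end.data`. The expression is safe as long as `in_file` is the `0-end.data` geometry.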

Screenshots

[Image: bayes_opt_geom_mfxx49820_r0008]


2 participants